1. INTRODUCTION

Computer simulation is ubiquitous in engineering design and optimization for solving complex
systems. Metamodeling, also known as surrogate modeling or the “model of the model”, reduces
the computational cost of solving a complex system, improves accuracy, and helps to find
the optimal solution in optimization. Typical surrogate models used by researchers in recent years
are the polynomial response surface (PRS), kriging or Gaussian process, radial basis function (RBF), support
vector regression (SVR), multivariate adaptive regression splines (MARS), and artificial neural network (ANN).
Each surrogate model has advantages and disadvantages that depend on the complexity of the function and on
the optimization model. Optimization is the process of finding the optimum value of an objective or cost
function f(x) with respect to the input variables x. Without metamodeling, finding the optimal values of
the variables may require a large number of evaluations of the expensive function f(x). Hence, recent
studies emphasize that the main advantage of metamodeling is the reduction of computational cost [1]-[4].

Current trends in metamodeling have led to a proliferation of studies on models that can accurately
predict the value of the objective function and constraints at a new design point without repeating
the original expensive simulation [5]. Metamodel-based optimization techniques consist of two stages:
(i) initial design using the design of experiment and (ii) function approximation using a metamodel.
In an investigation of metamodel techniques, Huachao et al. [6] state that metamodel-based optimization
involves two further steps: (i) exploit the surrogate models to capture potential optimal points as added
samples; (ii) explore the sparsely sampled regions to guarantee a balance between global and local search.
These steps are similar to those reported by several previous researchers for exploration and exploitation
of the design space [7]-[9]. A metamodel used within a global optimization algorithm can help to achieve
convergence, improve accuracy and robustness, and explore the region without becoming stuck in local valleys.

A large and growing body of literature on the application of metamodels in engineering agrees
that the method provides an optimal solution more quickly than other algorithms [10]-[15].
A review of kriging metamodeling published in [16] presents the basic kriging formula and extends
kriging in simulation with bootstrapping to estimate the variance of the kriging predictor. Another
work by the same author reviews kriging metamodels in experimental design and proposes “robust”
optimization, which accounts for uncertainty in some simulation inputs, using the Taguchi method.
Existing metamodeling reviews are extensive and focus mainly on engineering optimization problems such
as the complexity of crashworthiness problems [17], simulation of complex aircraft and aerodynamic
design systems [18], architecture design optimization [3], and simulation-based complex engineering design
and the use of surrogate models in a robust model optimization (RBO) framework [19]. In short, most prior
metamodel studies concentrate on function approximation, on finding the location of the observed points to
improve metamodel accuracy, and on exploring and exploiting the region for uncertainty analysis.

The choice of surrogate model differs with the test function to be approximated and the constraint
handling. Khairy et al. (2012) compared two metamodel techniques, RBF and kriging, in terms of accuracy,
robustness, efficiency, and scalability to identify the advantages and disadvantages of each technique.
Their study found that the RBF metamodel is more accurate than the kriging metamodel. Another study,
which compared PRS with RBF in experimental work, demonstrated that computational simulation with RBF
gives better results and suggested increasing the number of levels in the design space to improve
the accuracy of the metamodel [20]. The comparative study by Ostergard et al. (2018) found that kriging
produces the most accurate metamodels, followed by ANN and MARS, but becomes inefficient for large
training sets. The authors compared six major metamodeling techniques: linear regression with ordinary
least squares (OLS), random forest (RF), support vector regression (SVR), multivariate adaptive regression
splines (MARS), Gaussian process regression (GPR), and neural network (NN). The techniques were compared
with respect to accuracy, efficiency, ease-of-use, robustness, and interpretability [21]. The selection of
a metamodel therefore depends on the input parameters, the number of samples, and the complexity of
the system. In addition, the RBF metamodel is widely used for time series prediction, control system
parameters, and data mining [22, 23].

Unlike the comparison of six metamodels, a comparative study of kriging and RBF metamodeling
techniques for the design optimization of variable stiffness composites indicates that both are
the most precise and robust models in design space exploration. The author showed that kriging is
suitable for a small number of design variables, while the RBF metamodel is the best model for a large
number of variables [24]. Vicario (2016) compared kriging and ANN metamodels in computer experiments
and found that kriging predictions are acceptable although the model does not fully satisfy the required
predictive accuracy, while ANN also gives acceptable results but its inner workings are hard to
understand [25]. A case study of the effect of Latin hypercube sampling on metamodels shows that RBF
gave the best overall performance in global exploration compared to kriging. For future work,
the author suggested that exploration and exploitation characteristics can be further improved
using 'adaptive sampling' strategies [26]. Taken together, the studies reviewed here indicate that
no single metamodel is suitable for all mathematical function approximation. The metamodel employed
in optimization is chosen based on the design function, variables, dimension, and constraint handling.

Therefore, the advantages of using a metamodel in an optimization problem are reduced cost and
improved accuracy. However, no single metamodel is suitable for all design optimization problems,
because different metamodels perform well on different problems and conditions. In contrast to
the previous work on single metamodels, the ensemble method fits several metamodels and
selects the one that performs best. The first work to adopt an ensemble method in metamodeling,
by Goel et al. in 2007, implemented a weighted average based on the error of each metamodel to
choose the best metamodel [27]. Zhou (2016) implemented an ensemble method to select a suitable
metamodel for objective-oriented sequential sampling and proposed a genetic algorithm (GA) to find
the optimal weight coefficient for each metamodel [28]. A genetic algorithm is a kind of systematic
random search that imitates evolutionary processes in nature; it combines a chromosome with the fittest
chromosomes in the current population [29, 30]. More recently, Li et al. (2019) proposed the ensemble of
surrogates assisted particle swarm optimization (EAPSO) algorithm for medium-scale expensive problems to
enhance surrogate-assisted evolutionary algorithms, which tend to become trapped in local optima [31].
Based on this analysis of previous work, a new research direction can be identified that addresses
adaptive sampling in engineering design.
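
The weighted-average idea behind these ensembles can be sketched as follows. This is a minimal illustration and not the weighting scheme of Goel et al. [27] or Zhou [28]: it assumes that each metamodel already has a positive validation error (for example a cross-validation RMSE) and simply sets each weight inversely proportional to that error. The function names `inverse_error_weights` and `ensemble_predict` and all numbers are hypothetical.

```python
def inverse_error_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1.

    Assumption: every error is strictly positive.
    """
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_predict(predictions, weights):
    """Weighted average of the metamodel predictions at one design point."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical validation errors of three metamodels (e.g. PRS, kriging, RBF).
errors = [0.4, 0.1, 0.2]
weights = inverse_error_weights(errors)

# Hypothetical predictions of the same three metamodels at a new design point.
y_hat = ensemble_predict([1.8, 2.1, 2.0], weights)
```

With these weights the metamodel with the smallest error dominates the average, while the other models still contribute to the prediction.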

Based on a comprehensive review, the main objective of this paper is to propose a new approach that
implements an ensemble method in the adaptive sampling stage to enhance the sampling technique and
thereby improve the metamodel approximation. Instead of using an ensemble method to choose the best
metamodel, we propose a consensus, or non-homogeneous ensemble, method in which the models vote on
the best location for a new sample. To the best of the authors' knowledge, no previous work
implements consensus in an adaptive sampling strategy. The contributions of this paper are: (i) a proposed
method of infill sampling criteria, including deterministic and metaheuristic methods, for new sample
selection to improve the accuracy of the metamodel, and (ii) the use of a non-homogeneous ensemble method
in which the models vote on the best location of a new sample.


2. TYPES OF METAMODELING 

Metamodeling originates from response surface methodology (RSM), a method strongly related to
the design of experiment, which aims to reduce the number of simulations or experiments. In this
section, only two types of metamodel, kriging and RBF, are discussed, as suggested by the previous
literature. The kriging metamodel is based on a statistical method proposed by Daniel Gerhardus Krige.
The method was developed by experts in the field of geostatistics for prediction and is suitable for
nonlinear problems [32]. RBF, proposed by Hardy in 1971, was designed to handle multivariate data
interpolation [33]. RBF works well with noisy data, is suitable for function approximation and time
series prediction, and converges quickly [34].


2.1. Kriging

For design input sample points $X = [x_1, ..., x_m]^T$ with $X \in R^{m \times n}$ (m is the number
of samples and n is the number of variables) and the output response $Y = [y_1, ..., y_m]^T$ with
$Y \in R^{m \times 1}$, the kriging model is the combination of a trend term and a deviation term as in (1),


$\hat{y}(x) = f(x) + Z(x)$ (1)


where $\hat{y}(x)$ is the predicted value of the kriging model, $f(x)$ is the trend term, a known function
of x that provides a global model of the design space similar to PRS, and $Z(x)$ is the deviation term,
a random process with zero mean and variance $\sigma^2$. The covariance of $Z(x)$ can be expressed as (2),


$\mathrm{cov}[Z(x^i), Z(x^j)] = \sigma^2 \mathbf{R}[R(x^i, x^j)]$ (2)


where $\mathbf{R}$ is the correlation matrix and $R(x^i, x^j)$ is the correlation function between any two
sample points $x^i$ and $x^j$. Common correlation functions for the kriging metamodel are the exponential,
Gaussian, and spline functions.
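
As an illustration of (1) and (2), the following minimal sketch fits a kriging model with a constant trend and a Gaussian correlation function. It rests on several assumptions that are not taken from the sources cited above: the correlation is written as exp(-theta * ||a - b||^2), the parameter `theta` is fixed by hand rather than estimated by maximum likelihood, and a small `nugget` is added to the diagonal for numerical stability. All function names are hypothetical.

```python
import numpy as np


def gaussian_correlation(A, B, theta):
    """Gaussian correlation R(a, b) = exp(-theta * ||a - b||^2) for all pairs."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-theta * sq_dist)


def fit_kriging(X, y, theta=10.0, nugget=1e-10):
    """Fit a constant-trend kriging model; theta is assumed, not optimized."""
    m = X.shape[0]
    R = gaussian_correlation(X, X, theta) + nugget * np.eye(m)
    ones = np.ones(m)
    R_inv_y = np.linalg.solve(R, y)
    R_inv_1 = np.linalg.solve(R, ones)
    beta = (ones @ R_inv_y) / (ones @ R_inv_1)    # constant trend f(x) = beta
    gamma = np.linalg.solve(R, y - beta * ones)   # weights of the deviation term
    return {"X": X, "theta": theta, "beta": beta, "gamma": gamma}


def predict_kriging(model, X_new):
    """Prediction = trend + correlation-weighted deviation from the trend."""
    r = gaussian_correlation(X_new, model["X"], model["theta"])
    return model["beta"] + r @ model["gamma"]


# Five samples of a one-variable test function (m = 5, n = 1).
X = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
model = fit_kriging(X, y)
y_at_samples = predict_kriging(model, X)   # interpolates y up to round-off
```

Because the model interpolates, the predictions at the sample points reproduce the observed responses; away from the samples the prediction relaxes toward the constant trend.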


2.2. Radial basis function


Given N sample points in the design space $[x_1, x_2, ..., x_N]$ and their corresponding responses
$y = [y_1, y_2, ..., y_N]$, an RBF predictor at any point x in the design space is given by (3),


$\hat{y} = \hat{f}(x) = \sum_{i=1}^{N} w_i \varphi(\|x - x_i\|)$ (3)


where $w_i$ is the weight coefficient evaluated by fitting the model to the training data, $\varphi(\cdot)$ is
the nonlinear basis function, $\|\cdot\|$ denotes the Euclidean distance between two sample points, and
$\hat{f}$ is the predicted output of the objective function. The basis function weights $w_i$ can be computed
from (4),


$y = \Phi w$ (4)


The optimum values of the second-layer weights are computed using the least squares formula (5),


$[w_1, ..., w_N]^T = (\Phi^T \Phi)^{-1} \Phi^T [y_1, ..., y_N]^T$ (5)


where y is the vector of function values of the training data, w is the vector of basis function
weights, and $\Phi$ is the design matrix defined by (6),


$\Phi = \begin{bmatrix} \varphi(x_1, x_1) & \varphi(x_1, x_2) & \cdots & \varphi(x_1, x_N) \\ \varphi(x_2, x_1) & \varphi(x_2, x_2) & \cdots & \varphi(x_2, x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi(x_N, x_1) & \varphi(x_N, x_2) & \cdots & \varphi(x_N, x_N) \end{bmatrix}$ (6)


where $\varphi(x_i, x_j) = \varphi(\|x_i - x_j\|)$ is the basis function evaluated for the sample points $x_i$
and $x_j$ of the design variables, and N is the number of sample points. A typical radial basis
function is the Gaussian, expressed by (7),


$\varphi(x) = \exp\left(-\frac{\|x - c\|^2}{2\beta^2}\right)$ (7)


where x is the input data, c is the center, and $\beta$ is the spread parameter.
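
The RBF fitting procedure of this section can be sketched in a few lines. The sketch assumes a Gaussian basis of the form exp(-||x - c||^2 / (2 * beta^2)) with the sample points used as centers, and a hand-picked spread `beta`; these choices and the function names are illustrative assumptions. The weights are obtained with `numpy.linalg.lstsq`, which returns the least squares solution of y = Phi w without forming the inverse explicitly.

```python
import numpy as np


def gaussian_basis(X, centers, beta):
    """Matrix of phi(||x - c||) = exp(-||x - c||^2 / (2 * beta^2))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * beta ** 2))


def fit_rbf(X, y, beta=0.2):
    """Solve y = Phi w for the weights w in the least squares sense."""
    Phi = gaussian_basis(X, X, beta)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w


def predict_rbf(X_new, X, w, beta=0.2):
    """Weighted sum of basis functions centered at the sample points."""
    return gaussian_basis(X_new, X, beta) @ w


# Six samples of a one-variable test function.
X = np.linspace(0.0, 1.0, 6).reshape(-1, 1)
y = X[:, 0] ** 2
w = fit_rbf(X, y)
y_at_samples = predict_rbf(X, X, w)   # reproduces y at the sample points
```

The same value of `beta` must be used for fitting and prediction; in practice the spread is usually tuned, for example by cross-validation.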


3. ADAPTIVE SAMPLING USING METAMODEL OPTIMIZATION METHOD 

Metamodeling is a computational optimization technique that uses a 'cheap-to-run model' and involves
four stages: (i) sampling, (ii) function approximation, (iii) obtaining new samples, and (iv) refining
the metamodel. Simpson et al. summarize the uses of each metamodel and the fitting alternatives.
RSM is well established, easy to use, and suitable for low dimensions. The neural network metamodel is
best for highly nonlinear problems with a large number of samples and is recommended for deterministic
applications. The kriging metamodel is flexible and suitable for low dimensions [10].

The framework shown in Figure 1 highlights the basic process of metamodel-based optimization.
The first stage is sampling using a design of experiment method. The second stage is choosing
a suitable metamodel. The third stage is identifying an infill sampling or adaptive sampling method
based on the metamodel used in the approximation stage. If the algorithm has not met the stopping
criterion or the required number of sample points, a new sample point is added and the approximation
stage is repeated until the required number of samples is reached. Finally, a validation error such as
the root mean square error (RMSE) is computed to measure the accuracy or performance of the algorithm.
Figure 1. Metamodel-based optimization framework
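
The loop of Figure 1 can be sketched as follows. This is a simplified illustration under several assumptions: uniform random points stand in for the initial design of experiment, a Gaussian RBF with a fixed spread serves as the metamodel, the infill criterion is exploration only (the candidate farthest from the existing samples), and the stopping criterion is a fixed sample budget. The test function and all names are hypothetical.

```python
import numpy as np


def expensive_function(X):
    """Stand-in for the costly simulation (hypothetical test function)."""
    return np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2


def fit_rbf(X, y, beta=0.5):
    """Gaussian RBF surrogate with centers at the samples; returns a predictor."""
    def basis(Z):
        sq = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2.0 * beta ** 2))
    w, *_ = np.linalg.lstsq(basis(X), y, rcond=None)
    return lambda Z: basis(Z) @ w


def select_infill_point(X, candidates):
    """Exploration-only criterion: candidate farthest from all samples."""
    dist = np.linalg.norm(candidates[:, None, :] - X[None, :, :], axis=2)
    return candidates[np.argmax(dist.min(axis=1))]


rng = np.random.default_rng(0)
X = rng.random((8, 2))                        # stage 1: initial samples
y = expensive_function(X)
max_samples = 20                              # stopping criterion (sample budget)

while True:
    surrogate = fit_rbf(X, y)                 # stage 2: approximation
    if len(X) >= max_samples:
        break
    candidates = rng.random((200, 2))
    x_new = select_infill_point(X, candidates)    # stage 3: infill sampling
    X = np.vstack([X, x_new])                     # stage 4: refine with new sample
    y = np.append(y, expensive_function(x_new[None, :]))

# Validation error on points that were not used for fitting.
X_val = rng.random((100, 2))
rmse = np.sqrt(np.mean((surrogate(X_val) - expensive_function(X_val)) ** 2))
```

Only `select_infill_point` needs to change to try a different infill strategy, which is the stage this review focuses on.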


3.1. Design of experiment

Design of experiment (DoE) is a method to plan experiments and reduce their number by controlling
the parameters of the variables. The DoE method was introduced by Myers and Montgomery as a sampling
technique that reduces the time and cost of real experiments. In metamodel-based optimization, DoE can
improve the accuracy and reduce the validation error. Hence, the accuracy of metamodeling relies
significantly on the method of choosing the initial sample points. A good experimental design helps to
identify inadequacies in the suggested models and prevents bias in the model. The entire domain from
which the starting points are drawn is called the design space. DoE is an effective way to handle
computational simulation in metamodel-based optimization; it is a science that explores the most effective
way of structuring the tests in order to obtain the maximum information by analyzing the test results.
Rui et al. list three relevant terms in DoE: (i) factors, the input design parameters to change in
the experiment; (ii) levels, the values of the input parameters; and (iii) responses, the outputs associated
with the design parameters, which are a measure of design performance indicators [20]. Most authors note
that DoE is only used to design the initial samples before the optimization stage of a metamodel
optimization problem.
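
The terms factors and levels can be made concrete with a full factorial design, one of the classic designs, which evaluates every combination of levels. The factor names and level values below are purely hypothetical; the response would be the output measured or simulated for each run.

```python
from itertools import product

# Hypothetical factors (input design parameters) and their levels.
factors = {
    "thickness_mm": [1.0, 2.0, 3.0],
    "width_mm": [10.0, 20.0],
    "material": ["steel", "aluminium"],
}

# A full factorial design contains every combination of levels.
design = [dict(zip(factors, combo)) for combo in product(*factors.values())]

# The number of runs is the product of the number of levels of each factor.
n_runs = len(design)
```

Here three, two, and two levels give 3 x 2 x 2 = 12 runs; the count grows multiplicatively with every added factor, which is one motivation for the space-filling designs discussed next.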

Classic DoE includes full factorial, fractional factorial, central composite, Box-Behnken,
and orthogonal array designs. Modern DoE improves on the classical methods with designs such as Taguchi,
Latin hypercube sampling (LHS), Hammersley, and Monte Carlo. Most modern DoE methods aim to fulfill
space-filling criteria and improve the optimization of the metamodel. In a case study of the effect of LHS
in metamodel optimization problems, A. Afzhal et al. suggested that future exploration and exploitation
strategies can be assisted by 'adaptive sampling' for the metamodel [26]. DoE is used for one-shot sampling
in experimental design, whereas sequential and adaptive sampling are suggested by current researchers for
infill sampling in computer experiments. Sequential sampling for metamodels is developed from the well-known
one-shot DoE methods by adding infill points in a sequential manner. In contrast, adaptive sampling chooses
sample points from the information in the previous metamodel approximation output. The term active learning,
used by Settles (2010), describes the approach of selecting the most informative sample points using
the metamodel, so that it works better even with fewer points [35]. For this reason, adaptive sampling
has the potential to grow as a research area in optimization for refining the model and improving
the accuracy of the metamodel.
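
Of the space-filling designs mentioned above, LHS is simple enough to sketch directly. The following minimal version works in the unit hypercube and uses only the Python standard library; the function name `latin_hypercube` and its parameters are hypothetical, and production work would more likely rely on an established library implementation.

```python
import random


def latin_hypercube(n_samples, n_variables, seed=None):
    """Latin hypercube sample in the unit hypercube [0, 1)^n_variables.

    The range of each variable is split into n_samples equal strata and
    every stratum is used exactly once per variable.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_variables):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # One uniformly random point inside each stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose from per-variable columns to per-sample rows.
    return [list(point) for point in zip(*columns)]


samples = latin_hypercube(n_samples=5, n_variables=2, seed=1)
```

Each of the five strata of each variable contains exactly one sample, which is the one-shot, space-filling property that sequential and adaptive sampling then build on.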


3.2. Infill sampling criterion 

The infill sampling approach for metamodeling is designed to refine the current model by adding
sample points. Various infill criteria have been developed for adaptive sampling to achieve exploitation,
exploration, or a balance of exploitation and exploration. Several infill sampling criterion strategies
have been proposed in previous research, such as distance-based, variance-based, probability-based,
and Lipschitz-based designs [15]. Criteria such as the statistical lower bound, probability of improvement,
expected improvement, entropy search, variance-based, and Lipschitz-based criteria focus only on statistical
models such as kriging and are not suitable for the radial basis function [36]. This suggests that future
research needs to address this drawback of the RBF metamodeling method in optimization.
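
For reference, the expected improvement criterion, one of the statistical criteria listed above, can be sketched as below. The sketch assumes minimization, a Gaussian predictive distribution with mean `y_pred` and standard deviation `s_pred` at the candidate point (as a kriging model provides), and the best observed value `y_best`; the function name is hypothetical.

```python
import math


def expected_improvement(y_pred, s_pred, y_best):
    """Expected improvement over y_best for a minimization problem."""
    if s_pred <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(y_best - y_pred, 0.0)
    u = (y_best - y_pred) / s_pred
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (y_best - y_pred) * cdf + s_pred * pdf


# Same predicted value, different uncertainty: the more uncertain point scores higher.
ei_certain = expected_improvement(y_pred=1.0, s_pred=0.1, y_best=1.0)
ei_uncertain = expected_improvement(y_pred=1.0, s_pred=1.0, y_best=1.0)
```

The first term rewards exploitation (a low predicted value) and the second rewards exploration (a high predictive uncertainty). The dependence on `s_pred` illustrates why such criteria are tied to statistical models: a plain RBF interpolant does not typically supply a predictive variance.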


3.3. Recent methods in metamodel infill sampling strategy for RBF metamodeling

More recent studies have sought the best method to obtain the location of new sampling data, as
summarized in Table 1. However, this work is still under investigation and improves year by year to
enhance sampling with metamodels. Most studies of adaptive sampling in metamodel optimization have
focused on the infill sampling approach. There are still few studies of ensemble methods for obtaining
a new sample, as suggested by [37], who call for more effective infill sampling criteria that consider
more properties of ensembles. Hence, instead of employing an ensemble method in the metamodel
approximation stage, we propose to implement a consensus method in the infill sampling stage to find
the best way of obtaining a new sample point. Selecting the right model will improve the performance of
the metamodel.

Research on adaptive and sequential sampling techniques has grown markedly in an effort to
enhance the performance of the metamodel. In an investigation of sequential optimization using
RBF, Havinga et al. conclude that kriging metamodels perform better when no noise is present on
the black-box model, whereas the RBF interpolation function with a multiquadric activation function
performs best if noise is present. RBF metamodels proved to suffer less from fluctuations during global
optimization throughout the optimization procedure. Other studies report that approaches to obtaining
an RBF metamodel with desirable approximation performance can be divided into two types: (i) enhancing
the metamodel itself and (ii) enhancing the sampling technique [38]. The work of Jin et al. (2001) is one
of the prominent early studies, often cited in research on the development of new sequential sampling
approaches for global metamodeling, and addresses the main issues in sequential sampling. Jin et al.
suggested a maximin and cross-validation strategy with RBF to select new sample points based on
the location of the available sample points and refine the metamodel [39]. In 2014, Pan developed a method
that searches the extrema points of metamodels and the minimum points of a density function, and found
that the proposed method significantly outperforms previous work. Theunissen and Gjelstrup proposed
another adaptive sampling method, applied to point-wise measurement methods, to navigate
the two-dimensional spatial distribution of sampling points in RBF. The results demonstrate that the new
technique tends to be robust in terms of precision and reliability compared to traditional full-factorial
sampling [40].

Regis and Shoemaker (2005) proposed a novel method that selects the next sample point for expensive
function evaluation by minimizing the current response surface model subject to the constraints and
the additional constraint that the point lies at some distance from previously evaluated points [41].
Regis (2011) then suggested a different method called ConstrLMSRBF, which constructs RBF metamodels for
both the objective function and all the constraint functions in each iteration; these RBF metamodels guide
the selection of the next sample points where the objective and constraint functions are evaluated [42].
In continued research, Regis (2014a) designed algorithms called COBRA and Extended ConstrLMSRBF for
high-dimensional problems in which all initial points are located in the infeasible design space [43].
In 2018, Wu et al. proposed a new method named RBF-based Constrained Global Optimization (RCGO) to deal
with computationally expensive objective functions and inequality constraints. The method offers a practical
way to obtain the initial points for the case study, while an auxiliary objective function is designed to
identify the next iterative point and improve the metamodel [44]. All the studies discussed support
the hypothesis that using a metamodel can improve approximation and that identifying new sample points
with an adaptive sampling technique can enhance the model's precision, as shown in Table 1.


Table 1. Summary of recent studies on infill sampling strategies

Author                                                         Infill sampling method             Dimension   Complex function
Guang Pan et al. [45]                                          Density based                      4
Cai et al. [46]                                                Hybrid with Cut-HDMR               16
Amouzgar et al. [33]                                           Posteriori bias                    16
Cai et al. [47]                                                Cross validation                   6
Jin et al. [39], Regis and Shoemaker [48][41], Crombecq [49]   Distance based                     Low
Regis [50], Bajaj et al. [51]                                  Local model trust region           15
Zhou [52]                                                      Hybrid with self-organizing map    5
Mackman and Allen [53]                                         Laplacian criterion                Low
Theunissen and Gjelstrup [40]                                  Pointwise measurement              Low
Iuliano [7]                                                    Proper orthogonal decomposition    Low
Khalfallah et al. [4]                                          NSGA-II                            8
Wang and Ierapetritou [54]                                     Error based                        5 and 6
Li et al. [55]                                                 Adjusting RBF shape parameter      20
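
The cross-validation and distance-based strategies listed in Table 1 can be combined in a simple score, sketched below. This is a simplified illustration and not the exact criterion of Jin et al. [39] or Cai et al. [47]: each candidate is scored by the leave-one-out error of its nearest sample multiplied by the distance to that sample, so that points far from existing samples in poorly predicted regions are preferred. The Gaussian RBF with a fixed spread and all function names are assumptions.

```python
import numpy as np


def rbf_predict(X_train, y_train, X_new, beta=0.3):
    """Fit a Gaussian RBF on the training data and predict at X_new."""
    def basis(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2.0 * beta ** 2))
    w, *_ = np.linalg.lstsq(basis(X_train, X_train), y_train, rcond=None)
    return basis(X_new, X_train) @ w


def loo_errors(X, y):
    """Leave-one-out error |y_i - prediction without sample i| at every sample."""
    errors = np.empty(len(X))
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        pred = rbf_predict(X[keep], y[keep], X[i:i + 1])
        errors[i] = abs(y[i] - pred[0])
    return errors


def cv_distance_infill(X, y, candidates):
    """Score = LOO error of the nearest sample times the distance to it."""
    errors = loo_errors(X, y)
    dist = np.linalg.norm(candidates[:, None, :] - X[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    score = errors[nearest] * dist.min(axis=1)
    return candidates[score.argmax()]


rng = np.random.default_rng(1)
X = rng.random((10, 1))
y = np.sin(6.0 * X[:, 0])
candidates = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
x_new = cv_distance_infill(X, y, candidates)
```

Unlike variance-based criteria, this score needs only the metamodel predictions, which is why cross-validation and distance measures are common choices for RBF metamodels.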





4. PROPOSED METHOD 

Based on the comprehensive literature study, the research gap identified is the infill sampling
strategy used to obtain a new sample point. In this investigation, we propose to compare four infill
strategies: a distance-based method, a hybrid of metamodel and deterministic method, a hybrid of
metamodel and metaheuristic method, and cross-validation. The proposed method should achieve both
exploitation and exploration of the design space. Infill sampling criteria for refining the metamodel
can be deterministic or stochastic. Deterministic methods include trust region, line search, elimination,
Lagrangian methods, etc. Stochastic methods for infill sampling strategies include the genetic
algorithm, evolutionary algorithm, and particle swarm optimization. Constructing a metamodel with
a metaheuristic method is intended to overcome the limitations in high-dimensional problems.

The ensemble method is a way to select the metamodel that performs best among the models
used. Goel et al. identify two benefits of using an ensemble of metamodels: (i) identification of regions
with high error and (ii) a robust optimization approach [27]. A study by P. Ye et al. implemented three
metamodel techniques with optimized weight factors for the selection of promising sample points,
the narrowing of the exploration space, and the identification of the global optimum [56]. Ensemble methods
can be classified into non-homogeneous and homogeneous methods. The non-homogeneous method refers to
an agreement of experts, or consensus. The proposed design method shown in Figure 2 implements
the non-homogeneous method, also known as consensus (agreement of experts), for the infill sampling
strategy. An interesting question raised by the proposed method, as in any model comparison, is which
measure of model performance to use.
Figure 2. Proposed method of metamodel adaptive sampling using consensus
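
One possible reading of the consensus step is a plurality vote, sketched below. This is an illustrative assumption and not a specification of the proposed method: each metamodel votes for the candidate point at which it predicts the lowest objective value, and the candidate with the most votes becomes the new sample. The three surrogate functions, the candidate set, and the function name `consensus_infill` are hypothetical.

```python
from collections import Counter


def consensus_infill(surrogates, candidates):
    """Each surrogate votes for the candidate it predicts to be the minimum.

    Ties are broken in favour of the candidate that appears first in
    `candidates`.
    """
    votes = Counter()
    for predict in surrogates:
        best_index = min(range(len(candidates)),
                         key=lambda i: predict(candidates[i]))
        votes[best_index] += 1
    winner = min(votes, key=lambda i: (-votes[i], i))
    return candidates[winner], dict(votes)


# Three hypothetical one-variable metamodels of the same expensive function.
surrogates = [
    lambda x: (x - 0.50) ** 2,
    lambda x: (x - 0.55) ** 2 + 0.1,
    lambda x: abs(x - 0.20),
]
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
x_new, votes = consensus_infill(surrogates, candidates)
```

In this example two of the three metamodels agree on the candidate 0.5, which is therefore selected. Other voting rules, such as weighting each vote by the validation error of its metamodel, would fit the same structure.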


5. CONCLUSION AND FUTURE RESEARCH PROSPECT 

The most challenging issue for the metamodel is that the size of the design space, the computational
demand, and the required number of points increase exponentially with the dimension of the input variables,
which is known as the “curse of dimensionality”. The proposed method, which uses a consensus of metamodels,
can improve the robustness of the predicted sample location based on voting for the best model. Current
studies implement the ensemble method in the approximation stage to choose the best metamodel before
the algorithm selects a sample point and refines the metamodel. Comparative studies between kriging and RBF
show that kriging performs better for a small number of design variables, while RBF achieves better
performance for a large number of design variables. Future research on adaptive sampling in metamodels
should consider the impact of the kernel function, multi-objective optimization, and the use of multiple
infill samples in each iteration.